<!--begin.rcode results='hide', echo=FALSE, message=FALSE
library(caret)
data(BloodBrain)
library(party)
library(pls)
library(klaR)
theme1 <- caretTheme()
theme1$superpose.symbol$col = c(rgb(1, 0, 0, .4), rgb(0, 0, 1, .4), 
  rgb(0.3984375, 0.7578125,0.6445312, .6))
theme1$superpose.symbol$pch = c(15, 16, 17)
theme1$superpose.cex = .8
theme1$superpose.line$col = c(rgb(1, 0, 0, .9), rgb(0, 0, 1, .9), rgb(0.3984375, 0.7578125,0.6445312, .6))
theme1$superpose.line$lwd <- 2
theme1$superpose.line$lty = 1:3
theme1$plot.symbol$col = c(rgb(.2, .2, .2, .4))
theme1$plot.symbol$pch = 16
theme1$plot.cex = .8
theme1$plot.line$col = c(rgb(1, 0, 0, .7))
theme1$plot.line$lwd <- 2
theme1$plot.line$lty = 1

hook_inline = knit_hooks$get('inline')
knit_hooks$set(inline = function(x) {
  if (is.character(x)) highr::hi_html(x) else hook_inline(x)
  })
opts_chunk$set(comment=NA)

library(ellipse)
library(mlbench)
data(Sonar)
session <- paste(format(Sys.time(), "%a %b %d %Y"),
                 "using caret version",
                 packageDescription("caret")$Version,
                 "and",
                 R.Version()$version.string)
    end.rcode-->


<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
  <!--
  Design by Free CSS Templates
http://www.freecsstemplates.org
Released for free under a Creative Commons Attribution 2.5 License

Name       : Emerald 
Description: A two-column, fixed-width design with dark color scheme.
Version    : 1.0
Released   : 20120902

-->
  <html xmlns="http://www.w3.org/1999/xhtml">
  <head>
  <meta name="keywords" content="" />
  <meta name="description" content="" />
  <meta http-equiv="content-type" content="text/html; charset=utf-8" />
  <title>Miscellaneous Model Functions</title>
  <link href='http://fonts.googleapis.com/css?family=Abel' rel='stylesheet' type='text/css'>
  <link href="style.css" rel="stylesheet" type="text/css" media="screen" />
  </head>
  <body>
  <div id="wrapper">
  <div id="header-wrapper" class="container">
  <div id="header" class="container">
  <div id="logo">
  <h1><a href="#">Miscellaneous Model Functions</a></h1>
</div>
  <!--
  <div id="menu">
  <ul>
  <li class="current_page_item"><a href="#">Homepage</a></li>
<li><a href="#">Blog</a></li>
<li><a href="#">Photos</a></li>
<li><a href="#">About</a></li>
<li><a href="#">Contact</a></li>
</ul>
  </div>
  -->
  </div>
  <div><img src="images/img03.png" width="1000" height="40" alt="" /></div>
  </div>
  <!-- end #header -->
<div id="page">
  <div id="content">
  
<h1>Contents</h1>  
<ul>
  <li><a href="#knn">Yet Another <i>k</i>-Nearest Neighbor Function</a></li>
  <li><a href="#plsda">Partial Least Squares Discriminant Analysis</a></li>
  <li><a href="#bagMARS">Bagged MARS and FDA</a></li>
  <li><a href="#bag">General Purpose Bagging</a></li>
  <li><a href="#avnnet">Model Averaged Neural Networks</a></li>
  <li><a href="#pcannet">Neural Networks with a Principal Component Step</a></li>
  <li><a href="#ica">Independent Component Regression</a></li>
 </ul>   
  
<div id="knn"></div>   
<h1>Yet Another <i>k</i>-Nearest Neighbor Function</h1>

<p><span class="mx funCall">knn3</span> is a function for <i>k</i>-nearest neighbor classification. This particular implementation is a modification of the <span class="mx funCall">knn</span> C code and returns the vote information for all of the classes (<span class="mx funCall">knn</span> only returns the probability for the winning class). There is a formula interface via</p>
  
<!--begin.rcode eval=FALSE  
knn3(formula, data)
## or by passing the training data directly
## x is a matrix or data frame, y is a factor vector
knn3(x, y)
    end.rcode-->
<p> There are also <code>print</code> and <code>predict</code> methods. </p>

<p>For the Sonar data in the <a href="http://cran.r-project.org/web/packages/mlbench/index.html"><strong>mlbench</strong></a> package, we can fit an 11-nearest neighbor model:</p>

<!--begin.rcode MiscKnn1
library(caret)
library(mlbench)
data(Sonar)
set.seed(808)
inTrain <- createDataPartition(Sonar$Class, p = 2/3, list = FALSE)
## Save the predictors and class in different objects
sonarTrain <- Sonar[ inTrain, -ncol(Sonar)]
sonarTest  <- Sonar[-inTrain, -ncol(Sonar)]

trainClass <- Sonar[ inTrain, "Class"]
testClass  <- Sonar[-inTrain, "Class"]

centerScale <- preProcess(sonarTrain)
centerScale

training <- predict(centerScale, sonarTrain)
testing <- predict(centerScale, sonarTest)

knnFit <- knn3(training, trainClass, k = 11)
knnFit

predict(knnFit, head(testing), type = "prob")
    end.rcode-->
 
<p> Similarly, <a href="http://cran.r-project.org/web/packages/caret/index.html"><strong>caret</strong></a> contains a <i>k</i>-nearest neighbor regression function, <span class="mx funCall">knnreg</span>. It returns the average outcome for the neighbor.</p> 

<div id="plsda"></div>  
<h1>Partial Least Squares Discriminant Analysis</h1>

<p>The <span class="mx funCall">plsda</span> function is a wrapper for the <span class="mx funCall">plsr</span> function in the <a href="http://cran.r-project.org/web/packages/pls/index.html"><strong>pls</strong></a> package that does not require a formula interface and can take factor outcomes as arguments. The classes are broken down into dummy variables (one for each class). These 0/1 dummy variables are modeled by partial least squares. 

<p>
From this model, there are two approaches to computing the class predictions and probabilities:
<ul>
   <li>the softmax technique can be used on a per-sample basis to
   normalize the scores so that they are more ``probability like''
   (i.e. they sum to one and are between zero and one). For a vector
   of model predictions for each class <i>X</i>, the softmax class
   probabilities are computed as. The predicted class is simply the class with the largest model prediction, or equivalently, the largest class probability. This is the default behavior for <span class="mx funCall">plsda</span>.</li>
   
   <li>Bayes rule can be applied to the model predictions to form posterior probabilities. Here, the model predictions for the training set are used along with the training set outcomes to create conditional distributions for each class. When new samples are predicted, the raw model predictions are run through these conditional distributions to produce a posterior probability for each class (along with the prior). Bayes rule can be used by specifying <span class="mx arg">probModel</span> = <!--rinline '"Bayes"' -->. An additional parameter, <span class="mx arg">prior</span>, can be used to set prior probabilities  for the classes. </li>
</ul>
 </p>
<p>The advantage to using Bayes rule is that the full training set is used to directly compute the class probabilities (unlike the softmax function which only uses the current sample's scores). This creates more realistic probability estimates but the disadvantage is that a separate Bayesian model must be created for each value of <span class="mx arg">ncomp</span>, which is more time consuming.  
 </p>
<p>For the sonar data set, we can fit two PLS models using each technique and predict the class probabilities for the test set. 
</p>

<!--begin.rcode miscpls,tidy=FALSE
plsFit <- plsda(training, trainClass, ncomp = 20)
plsFit

plsBayesFit <- plsda(training, trainClass, ncomp = 20,
                     probMethod = "Bayes")
plsBayesFit

predict(plsFit, head(testing), type = "prob")

predict(plsBayesFit, head(testing), type = "prob")
    end.rcode-->

<p>Similar to <span class="mx funCall">plsda</span>, <a href="http://cran.r-project.org/web/packages/caret/index.html"><strong>caret</strong></a> also contains a function <span class="mx funCall">splsda</span> that allows for classification using sparse PLS. A dummy 
matrix is created for each class and used with the <span class="mx funCall">spls</span> function in the <a href="http://cran.r-project.org/web/packages/spls/index.html"><strong>spls</strong></a> package. The same approach to estimating class probabilities is used for <span class="mx funCall">plsda</span> and <span class="mx funCall">splsda</span>.
</p>

<div id="bagMARS"></div>  
<h1>Bagged MARS and FDA</h1>

<p>
Multivariate adaptive regression splines (MARS) models, like classification/regression trees, are unstable predictors (Breiman, 1996). This means that small perturbations in the training data might lead to significantly different models. Bagged trees and random forests are effective ways of improving tree models by exploiting these instabilities. <a href="http://cran.r-project.org/web/packages/caret/index.html"><strong>caret</strong></a> contains a function, <span class="mx funCall">bagEarth</span>, that fits MARS models via the <span class="mx funCall">earth</span> function. There are formula and non-formula interfaces. 
</p>
<p>
Also, flexible discriminant analysis is a generalization of linear discriminant analysis that can use non-linear features as inputs. One way of doing this is the use MARS-type features to classify samples. The function <span class="mx funCall">bagFDA</span> fits FDA models of a set of bootstrap samples and aggregates the predictions to reduce noise.
</p>
<p>
This function is deprecated in favor of the <span class="mx funCall">bag</span> function.
</p>
<div id="bag"></div>  
<h1>Bagging</h1>

<p>The <span class="mx funCall">bag</span> function offers a general platform for bagging classification and regression models. Like <span class="mx funCall">rfe</span> and <span class="mx funCall">sbf</span>, it is open and models are specified by declaring functions for the model fitting and prediction code (and several built-in sets of functions exist in the package). The function <span class="mx funCall">bagControl</span> has options to specify the functions (more details below).</p>

<p> The function also has a few non-standard features:</p>
   <ul>
      <li> The argument <span class="mx arg">var</span> can enable random sampling of the predictors at each bagging iteration. This is to de-correlate the bagged models in the same spirit of random forests (although here the sampling is done once for the whole model). The default is to use all the predictors for each model.</li>
      <li> The <span class="mx funCall">bagControl</span> function has a logical argument called <span class="mx arg">downSample</span> that is useful for classification models with severe class imbalance. The bootstrapped data set is reduced so that the sample sizes for the classes with larger frequencies are the same as the sample size for the minority class.</li>
      <li>If a parallel backend for the <strong>foreach</strong> package has been loaded and registered, the bagged models can be trained in parallel.</li>
   </ul>
<p>The function's control function requires the following arguments:</p>

<h2>The <span class="mx funCall">fit</span> Function</h2>

<p>Inputs:
<ul>
 <li> <span class="mx arg">x</span>: a data frame of the training set predictor data. 
   <li> <span class="mx arg">y</span>: the training set outcomes.
   <li> <span class="mx arg">...</span> arguments passed from <span class="mx funCall">train</span> to this function
</ul>
</p>
<p>
The output is the object corresponding to the trained model and any
other objects required for prediction. A simple example for a linear discriminant analysis model from the <strong>MASS</strong> package is:
<!--begin.rcode eval=FALSE,tidy=FALSE
function(x, y, ...) {
   library(MASS)
   lda(x, y, ...)
 }
    end.rcode-->
<h2>The <span class="mx funCall">pred</span> Function</h2>

<p>
This should be a function that produces predictors for new samples.
</p>
<p>Inputs:
<ul>
 <li> <span class="mx arg">object</span>: the object generated by the <code>fit</code> module. 
   <li> <span class="mx arg">x</span>: a matrix or data frame of predictor data.
</ul>
</p>
<p>
The output is either a number vector (for
regression), a factor (or character) vector for classification or a
matrix/data frame of class probabilities. For classification, it is probably better to average class probabilities instead of using the votes of the class predictions. Using the <span class="mx funCall">lda</span> example again:
</p>
  
<!--begin.rcode eval=FALSE  
## predict.lda returns the class and the class probabilities
## We will average the probabilities, so these are saved
function(object, x) predict(object, x)$posterior
    end.rcode-->
    
<h2>The <span class="mx funCall">aggregate</span> Function</h2>

<p>
This should be a function that takes the predictions from the constituent models and converts them to a single prediction per sample.
</p>
<p>Inputs:
<ul>
   <li> <span class="mx arg">x</span>: a list of objects returned by the <code>pred</code> module. 
   <li> <span class="mx arg">type</span>: an optional string that describes the type of output (e.g. "class", "prob" etc.).
</ul>
</p>
<p>
The output is either a number vector (for
regression), a factor (or character) vector for classification or a
matrix/data frame of class probabilities. For the linear discriminant model above, we saved the matrix of class probabilities. To average them and generate a class prediction, we could use:
</p>
<!--begin.rcode eval=FALSE,tidy=FALSE  
function(x, type = "class") {
  ## The class probabilities come in as a list of matrices
  ## For each class, we can pool them then average over them
  
  ## Pre-allocate space for the results
  pooled <- x[[1]] * NA
  n <- nrow(pooled)
  classes <- colnames(pooled)
  ## For each class probability, take the median across 
  ## all the bagged model predictions
  for(i in 1:ncol(pooled))
  {
    tmp <- lapply(x, function(y, col) y[,col], col = i)
    tmp <- do.call("rbind", tmp)
    pooled[,i] <- apply(tmp, 2, median)
  }
  ## Re-normalize to make sure they add to 1
  pooled <- apply(pooled, 1, function(x) x/sum(x))
  if(n != nrow(pooled)) pooled <- t(pooled)
  if(type == "class")
  {
    out <- factor(classes[apply(pooled, 1, which.max)],
                  levels = classes)
  } else out <- as.data.frame(pooled)
  out
}
    end.rcode-->
<p>For example, to bag a conditional inference tree (from the <strong>party</strong> package):</p>

<!--begin.rcode tidy=FALSE,cache=TRUE
library(caret)
set.seed(998)
inTraining <- createDataPartition(Sonar$Class, p = .75, list = FALSE)
training <- Sonar[ inTraining,]
testing  <- Sonar[-inTraining,]
set.seed(825)
baggedCT <- bag(x = training[, names(training) != "Class"],
                y = training$Class,
                B = 50,
                bagControl = bagControl(fit = ctreeBag$fit,
                                        predict = ctreeBag$pred,
                                        aggregate = ctreeBag$aggregate))              
summary(baggedCT)
    end.rcode-->

<div id="avnnet"></div>  
<h1>Model Averaged Neural Networks</h1>

<p>
The <span class="mx funCall">avNNet</span> fits multiple neural network models to the same data set and predicts using the average of the predictions coming from each constituent model. The models can be different either due to different random number seeds to initialize the network or by fitting the models on bootstrap samples of the original training set (i.e. bagging the neural network). For classification models, the class probabilities are averaged to produce the final class prediction (as opposed to voting from the individual class predictions. 
</p>
<p>As an example, the model can be fit via <span class="mx funCall">train</span>:</p>
<!--begin.rcode eval=FALSE,tidy=FALSE
set.seed(825) 
avNnetFit <- train(x = training,
                   y = trainClass,
                   method = "avNNet", 
                   repeats = 15,
                   trace = FALSE) 
    end.rcode-->
    
<div id="pcannet"></div>      
<h1>Neural Networks with a Principal Component Step</h1>

<p>
Neural networks can be affected by severe amounts of multicollinearity
in the predictors. The function <span class="mx funCall">pcaNNet</span> is a wrapper around
the <span class="mx funCall">preProcess</span> and <span class="mx funCall">nnet</span> functions that will run
principal component analysis on the predictors before using them as
inputs into a neural network. The function will keep enough components
that will capture some pre-defined threshold on the cumulative
proportion of variance (see the <span class="mx arg">thresh</span>
argument). For new samples, the same transformation is applied to the new predictor values (based on the loadings from the training set). The function is available for both regression and classification. 
</p>
<p>
This function is deprecated in favor of the <span class="mx funCall">train</span> function using <span class="mx arg">method</span> = <!--rinline '"nnet"' -->  and <span class="mx arg">preProc</span> = <!--rinline '"pca"' --></code>.
</p>

<div id="ica"></div>  
<h1>Independent Component Regression</h1>

<p>
The <span class="mx funCall">icr</span> function can be used to fit a model analogous to
principal component regression (PCR), but using independent component
analysis (ICA). The predictor data are centered and projected to the
ICA components. These components are then regressed against the
outcome. The user needed to specify the number of components to keep.
</p>
<p>
The model uses the <span class="mx funCall">preProcess</span> function to compute the latent
variables using the  <a href="http://cran.r-project.org/web/packages/fastICA/index.html" <strong>fastICA</strong></a> package.
</p>
<p>
Like PCR, there is no guarantee that there will be a correlation
between the new latent variable and the outcomes.
</p>




<div style="clear: both;">&nbsp;</div>
  </div>
  <!-- end #content -->
<div id="sidebar">
<ul>
  <li>
    <h2>General Topics</h2>
    <ul>
      <li><a href="index.html">Front Page</a></li>
      <li><a href="visualizations.html">Visualizations</a></li>
      <li><a href="preprocess.html">Pre-Processing</a><li>
      <li><a href="splitting.html">Data Splitting</a></li>
      <li><a href="varimp.html">Variable Importance</a></li>
      <li><a href="other.html">Model Performance</a></li>
      <li><a href="parallel.html">Parallel Processing</a></li>
    </ul>
    <h2>Model Training and Tuning</h2>
    <ul>
      <li><a href="training.html">Basic Syntax</a></li>
      <li><a href="modelList.html">Sortable Model List</a></li>
      <li><a href="bytag.html">Models By Tag</a></li>
      <li><a href="similarity.html">Models By Similarity</a></li>
      <li><a href="custom_models.html">Using Custom Models</a></li>
      <li><a href="sampling.html">Sampling for Class Imbalances</a></li> 
      <li><a href="random.html">Random Search</a></li> 
      <li><a href="adaptive.html">Adaptive Resampling</a></li> 
    </ul>
    <h2>Feature Selection</h2>
    <ul>
      <li><a href="featureselection.html">Overview</a>
      <li><a href="rfe.html">RFE</a></li>
      <li><a href="filters.html">Filters</a></li>
      <li><a href="GA.html">GA</a></li>
      <li><a href="SA.html">SA</a></li>
    </ul>  
  </li>
</ul>
</div>
<!-- end #sidebar -->
<div style="clear: both;">&nbsp;</div>
  </div>
  <div class="container"><img src="images/img03.png" width="1000" height="40" alt="" /></div>
  <!-- end #page -->
</div>
  <div id="footer-content"></div>
<!--begin.rcode echo = FALSE
knit_hooks$set(inline = hook_inline)    
    end.rcode-->   
  <div id="footer">
  <p>Created on <!--rinline I(session) -->.</p>
  </div>
  <!-- end #footer -->
</body>
  </html>
